NSF PAR Search | NSF Public Access Repository

Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis

https://doi.org/10.1093/jamia/ocab270

Nelson, Charlotte A; Bove, Riley; Butte, Atul J; Baranzini, Sergio E (December 2021, Journal of the American Medical Informatics Association)

Abstract
Objective: Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient’s health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on “black box” algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph.
Materials and Methods: A modified version of the PageRank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease.
Results: Our model predicted disease status of 5752 subjects 3 years before being diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS.
Conclusion: Using data from EHR as input, SPOKEsigs describe patients at both the clinical and biological levels. We provide a clinical use case for detecting MS up to 5 years prior to a patient’s documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.
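The embedding step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the authors modified PageRank, so the code below uses standard personalized PageRank (random walk with restart) as a stand-in, not the published method. The toy graph, node names, seed weights, and the function name `personalized_pagerank` are all hypothetical, chosen only for illustration. The resulting per-node scores play the role of a patient signature; the downstream random forest step is omitted.

```python
def personalized_pagerank(adjacency, seeds, damping=0.85, iterations=100, tol=1e-10):
    """Standard personalized PageRank by power iteration (illustrative only).

    adjacency: dict mapping each node to a list of neighbor nodes.
    seeds: dict mapping seed nodes (e.g. concepts found in one patient's
           record) to non-negative weights; the walk restarts at these nodes.
    """
    nodes = sorted(adjacency)
    total = sum(seeds.values())
    # Restart distribution: normalized seed weights, zero elsewhere.
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every node receives its share of the restart probability.
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbors = adjacency[node]
            if neighbors:
                # Spread the walking mass evenly over outgoing edges.
                share = damping * rank[node] / len(neighbors)
                for nb in neighbors:
                    new_rank[nb] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * restart[n]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy knowledge graph (undirected, listed in both directions).
toy_graph = {
    "Symptom:fatigue": ["Disease:MS", "Disease:anemia"],
    "Symptom:optic_neuritis": ["Disease:MS"],
    "Disease:MS": ["Symptom:fatigue", "Symptom:optic_neuritis", "Gene:HLA-DRB1"],
    "Disease:anemia": ["Symptom:fatigue"],
    "Gene:HLA-DRB1": ["Disease:MS"],
}

# Hypothetical patient record: two coded symptoms, equally weighted.
patient_seeds = {"Symptom:fatigue": 1.0, "Symptom:optic_neuritis": 1.0}

signature = personalized_pagerank(toy_graph, patient_seeds)
for node, score in sorted(signature.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.4f}")
```

In this sketch, nodes that are never mentioned in the record (such as the gene node) still receive a nonzero score because probability mass flows to them through the graph, which is the sense in which a signature of this kind can describe a patient beyond the recorded clinical codes.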
Full Text Available
Knowledge Network Embedding of Transcriptomic Data from Spaceflown Mice Uncovers Signs and Symptoms Associated with Terrestrial Diseases

https://doi.org/10.3390/life11010042

Nelson, Charlotte A.; Acuna, Ana Uriarte; Paul, Amber M.; Scott, Ryan T.; Butte, Atul J.; Cekanaviciute, Egle; Baranzini, Sergio E.; Costes, Sylvain V. (January 2021, Life)

There has long been an interest in understanding how the hazards from spaceflight may trigger or exacerbate human diseases. With the goal of advancing our knowledge on physiological changes during space travel, NASA GeneLab provides an open-source repository of multi-omics data from real and simulated spaceflight studies. Alone, this data enables identification of biological changes during spaceflight, but cannot infer how that may impact an astronaut at the phenotypic level. To bridge this gap, Scalable Precision Medicine Oriented Knowledge Engine (SPOKE), a heterogeneous knowledge graph connecting biological and clinical data from over 30 databases, was used in combination with GeneLab transcriptomic data from six studies. This integration identified critical symptoms and physiological changes incurred during spaceflight.
Full Text Available
Progress toward a universal biomedical data translator

https://doi.org/10.1111/cts.13301

Fecho, Karamarie; Thessen, Anne E.; Baranzini, Sergio E.; Bizon, Chris; Hadlock, Jennifer J.; Huang, Sui; Roper, Ryan T.; Southall, Noel; Ta, Casey; Watkins, Paul B.; et al (August 2022, Clinical and Translational Science)

Abstract: Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well‐being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline‐specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph‐based “Translator” system capable of integrating existing biomedical data sets and “translating” those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system’s architecture, performance, and quality of results. We apply Translator to several real‐world use cases developed in collaboration with subject‐matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state‐of‐the‐art, biomedical graph‐based question‐answering systems.
Full Text Available
STREAMS guidelines: standards for technical reporting in environmental and host-associated microbiome studies

https://doi.org/10.1038/s41564-025-02186-2

Kelliher, Julia M; Mirzayi, Chloe; Bordenstein, Sarah R; Oliver, Aaron; Kellogg, Christina A; Hatcher, Eneida L; Berg, Maureen; Baldrian, Petr; Aljumaah, Mashael; Miller, Cassandra Maria Luz; et al (December 2025, Nature Microbiology)

Free, publicly-accessible full text available December 1, 2026
A biomedical open knowledge network harnesses the power of AI to understand deep human biology

https://doi.org/10.1002/aaai.12037

Baranzini, Sergio E.; Börner, Katy; Morris, John; Nelson, Charlotte A.; Soman, Karthik; Schleimer, Erica; Keiser, Michael; Musen, Mark; Pearce, Roger; Reza, Tahsin; et al (March 2022, AI Magazine)

Abstract: Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging, in part due to the daunting complexity of the molecular and cellular pathways that govern human physiology and pathology. In this article, we describe concrete uses of Scalable PrecisiOn Medicine Knowledge Engine (SPOKE), an open knowledge network that connects curated information from thirty‐seven specialized and human‐curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID‐19 research, and chronic disease diagnosis and management.